Relieving Both Storage and Recovery Burdens in Big Data Clusters with R-STAIR Codes

Authors

  • Mingqiang Li
  • Runhui Li
  • Patrick P. C. Lee
Abstract

Enterprise storage clusters increasingly adopt erasure coding to protect stored data against transient and permanent failures. Existing erasure code designs not only introduce extra parity information in a storage-inefficient manner, but also consume substantial cross-rack recovery bandwidth. To relieve both storage and recovery burdens of erasure coding, we adapt our previously proposed STAIR codes into recovery-oriented STAIR (R-STAIR) codes, which achieve storage efficiency, recovery efficiency, and configuration generality against a mix of node and rack failures. We evaluate R-STAIR codes via analysis and Hadoop experiments. We show that by supporting mixed fault tolerance, R-STAIR codes can significantly reduce both storage and recovery burdens in storage clusters.

Similar resources

STAIR codes: a general family of erasure codes for tolerating device and sector failures in practical storage systems

Practical storage systems often adopt erasure codes to tolerate device failures and sector failures, both of which are prevalent in the field. However, traditional erasure codes employ device-level redundancy to protect against sector failures, and hence incur significant space overhead. Recent sector-disk (SD) codes are available only for limited configurations due to the relatively strict ass...

Hybrid Regenerating Codes for Distributed Storage Systems

Distributed storage systems are justified mainly by their ability to store data reliably across unreliable nodes, giving the system long-term durability. Recently, regenerating codes have been proposed to balance repair bandwidth against per-node storage capacity, drawing on ideas from network coding. In this paper, a new variation ...

XORing Elephants: Novel Erasure Codes for Big Data

Distributed storage systems for large clusters typically use replication to provide reliability. Recently, erasure codes have been used to reduce the large storage overhead of three-replicated systems. Reed-Solomon codes are the standard design choice and their high repair cost is often considered an unavoidable price to pay for high storage efficiency and high reliability. This paper shows how ...

HFR code: a flexible replication scheme for cloud storage systems

Fractional repetition (FR) codes are a family of repair-efficient storage codes that provide exact and uncoded node repair at the minimum bandwidth regenerating point. The advantageous repair properties are achieved by a tailor-made two-layer encoding scheme which concatenates an outer maximum-distance-separable (MDS) code and an inner repetition code. In this paper, we generalize the applicatio...

Oil Reservoirs Classification Using Fuzzy Clustering (RESEARCH NOTE)

Enhanced Oil Recovery (EOR) is a well-known method to increase oil production from oil reservoirs. Applying EOR to a new reservoir is a costly and time-consuming process. Incorporating available knowledge of oil reservoirs in the EOR process eliminates these costs and saves operational time and work. This work presents a universal method to apply EOR to reservoirs based on the available data by...

Publication date: 2016